Etymological Wordnet: Tracing The History of Words

Author

  • Gerard de Melo
Abstract

Research on the history of words has led to remarkable insights about language and also about the history of human civilization more generally. This paper presents the Etymological Wordnet, the first database that aims at making word origin information available as a large, machine-readable network of words in many languages. The information in this resource is obtained from Wiktionary. Extracting a network of etymological information from Wiktionary requires significant effort, as much of the etymological information is only given in prose. We rely on custom pattern matching techniques and mine a large network with over 500,000 word origin links as well as over 2 million derivational/compositional links.
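As a rough illustration of what pattern matching over etymology prose can look like, the sketch below extracts word origin links from a single phrasing, "from <Language> ''<word>''", written in wiki markup. The pattern, the function name `extract_origin_links`, the `language:word` node format, and the decision to chain successive origins are all assumptions made for this example; the abstract does not specify the actual patterns used to build the Etymological Wordnet.

```python
import re
from typing import List, Tuple

# Hypothetical pattern (an assumption, not taken from the paper): matches prose
# such as "From Middle English ''hous''", where the language name is one or
# more capitalized words and the source word is wrapped in wiki italics ('').
ORIGIN_PATTERN = re.compile(
    r"\b[Ff]rom\s+"
    r"(?P<lang>[A-Z][A-Za-z]*(?:\s+[A-Z][A-Za-z]*)*)"
    r"\s+''(?P<src>[^']+)''"
)


def extract_origin_links(word: str, language: str, etymology_text: str) -> List[Tuple[str, str]]:
    """Return (derived, origin) pairs found in one etymology section.

    Nodes are written as "language:word". Successive matches are chained, so
    the second origin is attached to the first origin rather than to the
    headword. This chaining is a simplifying assumption of the sketch.
    """
    links = []
    current = f"{language}:{word}"
    for match in ORIGIN_PATTERN.finditer(etymology_text):
        origin = f"{match.group('lang')}:{match.group('src')}"
        links.append((current, origin))
        current = origin
    return links


if __name__ == "__main__":
    text = "From Middle English ''hous'', from Old English ''hūs''."
    for derived, origin in extract_origin_links("house", "English", text):
        print(f"{derived} <- {origin}")
```

A single pattern like this covers only one phrasing. Real etymology sections vary widely in wording and markup, so a full extractor would need many such patterns.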

Similar resources

Towards Universal Multilingual Knowledge Bases

Lexical, ontological, as well as encyclopedic knowledge is increasingly being encoded in machine-readable form. This paper deals with knowledge representation in multilingual settings. It begins by proposing a generic graph-based knowledge base framework, and then, in three case studies, explains how preexisting knowledge can be cast into this framework. The first case study involves enriching ...

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

Vocabulary Instruction Method and Specialized Reading Comprehension: Build a Bridge or Wash it away

The present study aimed to examine and compare the impact of teaching economic terms through etymological elaboration with that of three more conventional methods of vocabulary instruction in ESP courses in Iran (teaching through contextual definitions, L1 translation, and implicit instruction) on the learners' general comprehension of economic texts and their understanding of the author's opinion....

ABHIDHA: An extended WordNet for Indo-Aryan Languages

A lexical knowledge base is an important component of any intelligent information processing system. The WordNet developed at the Cognitive Systems Laboratories at Princeton has served as a lexical reference system for natural language processing activities. The Indian language based activities at our institute mainly in text-to-speech synthesis and natural language generation from iconic input...

Annotating Cognates and Etymological Origin in Turkic Languages

Turkic languages exhibit extensive and diverse etymological relationships among lexical items. These relationships make the Turkic languages promising for exploring automated translation lexicon induction by leveraging cognate and other etymological information. However, due to the extent and diversity of the types of relationships between words, it is not clear how to annotate such information...

Publication date: 2014